FPGA Flows To Shift
FPGA vendors, in their long list of claimed advantages over ASICs, often point to the relative simplicity of their design flows. But this claim has evolved over the years, and recent discussions with leading vendors suggest that it will keep on shifting.
For instance, FPGA vendors used to claim that ASIC-style verification was simply unnecessary in FPGAs. You already knew that the underlying silicon was correct, so the only question was whether you had programmed in the correct logic. With the growth in complexity of FPGAs, it is now accepted that at least simulation is a necessary part of FPGA verification for large designs.
The next case in point was physical synthesis. When these tools first appeared in deep-submicron ASIC flows, FPGA vendors gloated over how FPGAs, with their highly predictable, more or less placement-independent timing, made such approaches unnecessary. But then Synplicity announced what amounts to physical synthesis for FPGAs: an iterative approach that asks the designer to group code into likely-looking modules, performs an initial synthesis to identify critical nets and eventually does detailed placement in critical regions of the design.
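Reduced to a sketch, that loop is easy to picture. The Python fragment below models it with an invented data structure and an assumed per-pass timing gain; it is meant only to make the iteration concrete, not to represent Synplicity's actual algorithm or tool interface.

from dataclasses import dataclass

@dataclass
class Net:
    name: str
    slack_ns: float  # negative slack means the net misses timing

def critical_nets(nets):
    # Stand-in for a real static timing analysis after synthesis.
    return [n for n in nets if n.slack_ns < 0]

def place_and_resynthesize(critical):
    # Model detailed placement in the critical regions as a slack gain.
    for net in critical:
        net.slack_ns += 0.4  # assumed per-pass improvement, illustration only

def physical_synthesis(nets, max_passes=5):
    for pass_num in range(1, max_passes + 1):
        critical = critical_nets(nets)
        if not critical:
            print("Timing met after", pass_num - 1, "placement pass(es)")
            return
        print("Pass", pass_num, ":", len(critical), "critical net(s)")
        place_and_resynthesize(critical)
    print("Still failing timing after", max_passes, "passes")

physical_synthesis([Net("bus_req", -0.7), Net("fifo_full", 0.2)])

The point of the loop is that placement effort is spent only where timing analysis says it matters, which is the same bargain physical synthesis struck in the ASIC world.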
The next item on the list may be more of an issue. FPGA vendors have always maintained that signal-integrity considerations were irrelevant to FPGA users. But at least one FPGA vendor is now showing signal-integrity tools on its technology road map. It is still true, designers say, that they could impose design rules on the programmable interconnect of an FPGA strict enough to prevent signal-integrity problems for any combination of signals and routing. But those rules would be so restrictive in advanced processes that they would start to influence other issues, such as performance and die size, far too much. So the FPGA vendors have taken the same route that process developers did: They have relaxed the rules, and they provide tools to warn users when they are about to get into trouble.
The problem is that the tools may lag the problem by a generation. Sources say that it is already possible on existing devices to design very high-speed circuits, especially with some of the very fast I/O available on the latest parts, that will have signal-integrity problems. And as yet there are no analysis tools to steer users out of trouble. It can take a great deal of experience to get some of these high-speed blocks working properly, according to vendor sources.
On-Chip Debug Mania
Before microprocessors became cores and disappeared into the guts of SoCs, there was a well-developed industry supplying microprocessor emulators and processor-specific logic analyzers to development teams. These systems could recognize quite complex events defined by Boolean combinations of signals occurring sequentially in other parts of the system. They could infer things going on inside the microprocessor, such as instruction execution cycles, register writes and cache hits, and they could control very wide and deep trace buffers based on these complex events. All these capabilities were added because they were essential to software debug.
Early on in the SoC world, these problems were adequately addressed by simulation. As CPUs got faster and code got larger, simulation became too slow, so most microprocessor cores evolved internal debug hardware. But these internal debug monitors have never had the kinds of capabilities that full-blown development systems had. That was a problem for software developers, because it left a gap between the excellent visibility, at one instruction per second, from simulation and the mediocre debug capability at full speed from the embedded engine. Very expensive solutions such as ASIC emulators could provide some middle ground, but still not full-speed operation.
One processor core vendor, ARC Cores Ltd., has raised an interesting question: if you could have all the real-time, on-chip debug capability you wanted, but you had to give up die area for it, how much would be enough? The company posed this question by way of a highly configurable debug module: essentially an on-chip logic analyzer with complex event triggers, visibility into the heart of the ARC processor core and a potentially very large trace buffer.
The question comes with an additional parameter: How are you going to get data off the chip as it's being collected? ARC addressed this problem with a flow-through data compression unit to reduce the bandwidth of the 120 signals that define the state of the core, and with real-time data filters that avoid capturing uninformative events.
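A toy model makes the idea concrete: discard cycles that carry no new information, then run-length encode what remains before it leaves the chip. ARC's real filters and compressor are not described in that level of detail here, so the Python sketch below is purely illustrative.

def compress_trace(cycles):
    # cycles: iterable of per-cycle core-state words (think 120-bit values).
    out = []              # list of (state, repeat_count) records
    prev, count = None, 0
    for state in cycles:
        if state == prev:
            count += 1    # nothing changed this cycle; just extend the run
        else:
            if prev is not None:
                out.append((prev, count))
            prev, count = state, 1
    if prev is not None:
        out.append((prev, count))
    return out

# A stalled pipeline produces long runs of identical state words:
raw = [0x1A2B] * 50 + [0x1A2C] + [0x1A2B] * 50
packed = compress_trace(raw)
print(len(raw), "cycles ->", len(packed), "trace records")  # 101 -> 3

Even this crude scheme shows how quickly filtering and compression together can bring the off-chip bandwidth down.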
This may sound like heaven for software developers: a chance to trigger on a complex combination of CPU state and external events; start and stop trace of selected events in response to the triggers; and all the other wonderful stuff that microprocessor emulators (and C++ programs in conjunction with ASIC emulation) can do. But of course it costs die area.
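The triggering half of that picture can be sketched just as simply: one Boolean condition over CPU state and external events starts the trace, another stops it. The field names below (pc, write_addr, ext_irq) are invented for illustration and are not ARC's actual debug interface.

def make_trigger(start_cond, stop_cond):
    # Returns a stateful per-cycle predicate: capture this cycle or not?
    tracing = False
    def should_capture(cycle):
        nonlocal tracing
        if not tracing and start_cond(cycle):
            tracing = True
        elif tracing and stop_cond(cycle):
            tracing = False
        return tracing
    return should_capture

# Example: trace from the first write to 0x80000000 while an external
# interrupt is asserted, until the PC reaches a (hypothetical) error handler.
capture = make_trigger(
    start_cond=lambda c: c["write_addr"] == 0x80000000 and c["ext_irq"],
    stop_cond=lambda c: c["pc"] == 0x1C0,
)

cycle_stream = [
    {"pc": 0x100, "write_addr": 0x0,        "ext_irq": False},
    {"pc": 0x104, "write_addr": 0x80000000, "ext_irq": True},
    {"pc": 0x108, "write_addr": 0x0,        "ext_irq": False},
    {"pc": 0x1C0, "write_addr": 0x0,        "ext_irq": False},
]
trace = [c for c in cycle_stream if capture(c)]
print(len(trace), "cycles captured")  # prints: 2 cycles captured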
The conventional view is that when push comes to shove, die area wins. That view, according to ARC, is wrong. The company reports that some customers are willing to double the CPU core area in order to get all the capability they want. Others will accept less capability with less area penalty. In some cases the customer wants to do two tapeouts, one with and one without the real-time trace engine. But in other cases customers see the value of the trace hardware as justifying the permanent increase in die size. Such is the growing importance of software development in SoC projects.
ASIC Company Model
Conventional wisdom suggests that if you are doing a mainstream design, you can make life easier by working with an ASIC company. But if you know you are pushing the envelope, you are better off with a COT flow, even though it involves a lot more responsibility. At least you have control of your destiny and don't have to design within the walls of someone else's ASIC libraries.
That wisdom depends on just what sort of ASIC company you are talking about.
Suppose you put together an ASIC company with a different model: fabless instead of invested in fab ownership and process development, for instance, and stocked with experienced full-custom designers instead of traditional ASIC back-end gurus.
That is the model Silicon Value is trying. The company is fabless, and is built around a custom design team left behind in Israel by the dissolution of Digital Equipment Corp. Yes, more Alpha CPU veterans.
The result, according to SV, is a rather different kind of offering. Instead of an extensive library of IP and formidable back-end services, the company offers what sounds remarkably like a design-services model. Their full-custom designers engage with customers starting at RTL, bringing tools and skills developed in the Alpha days to bear on optimizing and implementing a design.
The company, which has been around awhile, focuses on designs that in the 180-nm generation would normally be 10- to 15-mm dice. The engagement can be simply teamwork in producing a tapeout, or it can include taping out, foundry relationships and actually providing silicon.
The process starts with RTL, where SV can instantiate devices such as data paths and multiport memories for which they have finely tuned generators, according to the company. SV likes to be sufficiently engaged that they can help keep logical and physical hierarchy the same at the top level, CEO Vacit Arat said. This makes for big advantages downstream.
As the design moves out of synthesis, SV's experience in custom design starts to show again. The company has tools that comb the netlist for repeated patterns of cells. These, much as in a data compression algorithm, become candidates for custom cell development, reducing power, improving performance and shrinking die area. Of course, creating new merged or split cells necessitates new cell models at all levels and another Verplex verification pass, but the results can be well worthwhile, according to the company.
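The flavor of that search is easy to sketch. The fragment below only counts recurring driver-to-load cell-type pairs in a made-up netlist and ranks them by frequency; Silicon Value's tools presumably look at larger patterns and weigh timing and power as well, so treat this strictly as an illustration of the data-compression analogy.

from collections import Counter

# Hypothetical gate-level netlist: (instance, cell_type, driven_by_instance)
netlist = [
    ("u1", "NAND2", None), ("u2", "INV", "u1"),
    ("u3", "NAND2", None), ("u4", "INV", "u3"),
    ("u5", "NAND2", None), ("u6", "INV", "u5"),
    ("u7", "XOR2",  None), ("u8", "DFF", "u7"),
]

cell_type = {inst: ctype for inst, ctype, _ in netlist}

# Count every (driver_type, load_type) pair that appears in the netlist.
pairs = Counter(
    (cell_type[driver], ctype)
    for _inst, ctype, driver in netlist
    if driver is not None
)

# The most frequent pairs are the first candidates for a merged custom cell.
for (drv, load), count in pairs.most_common(3):
    print(drv, "->", load, ":", count, "occurrences")
# NAND2 -> INV shows up three times: a candidate for one merged AND2-style cell.

The NAND2-into-INV pair is the trivial case; the same counting idea extends to larger connected patterns, which is where the power, performance and area gains the company describes would come from.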
Another goody that the DEC background brings is a custom-oriented clock and power distribution methodology. Remember, the Alpha processors were known for arc-welder-sized clock currents and absurdly small clock skews. That was done by using a mesh, rather than a tree, topology for clock and power distribution. The technique is gaining adherents in other places, but the SV team has been using it for years.
Other touches from the Alpha days are there as well, such as the habit of putting skeleton PN structures in all available open space. These can be used later, with just a metal-mask change, to implement additional gates for minor corrections in logic. Given how often a logic error is simply a polarity or clock choice problem, that could save a complete mask-set spin.
In any case, SV puts another data point in the market, somewhere between design services, conventional ASIC houses and full-custom design teams.